Data Quality in the Human and Environmental Health Sciences: Using Statistical Confidence Scoring to Improve QSAR/QSPR Modeling

Authors

  • Fabian P. Steinmetz
  • Judith C. Madden
  • Mark T. D. Cronin
Abstract

An increasing amount of toxicity data is becoming publicly available, enabling in silico modeling. However, questions often arise as to how to incorporate data quality and how to deal with contradictory data when more than one data point is available for the same compound. In this study, two well-known and well-studied QSAR/QSPR models, for skin permeability and aquatic toxicology, were investigated in the context of statistical data quality. In particular, the potential benefits of incorporating the statistical Confidence Scoring (CS) approach within modeling and validation were examined. As a result, robust QSAR/QSPR models for the skin permeability coefficient and the toxicity of nonpolar narcotics in the Aliivibrio fischeri assay were created. CS-weighted linear regression for training and CS-weighted root-mean-square error (RMSE) for validation were statistically superior to standard linear regression and standard RMSE. Strategies are proposed for interpreting data with high and low CS, as well as for dealing with large data sets containing multiple entries.
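
The abstract names two techniques, CS-weighted linear regression and CS-weighted RMSE, without giving their formulas. The following is a minimal sketch of what such weighting could look like, not the authors' implementation. It assumes each compound carries a non-negative confidence score that is used directly as a weight in ordinary weighted least squares and in a weight-normalized RMSE. The function names and the toy numbers are hypothetical.

```python
import numpy as np


def weighted_linear_fit(X, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - b0 - x_i . b)^2.

    w is ASSUMED to hold one non-negative confidence score per compound;
    how the paper actually derives or applies CS is not stated in the abstract.
    Returns the coefficient vector [intercept, slope_1, ...].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])  # intercept column first
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into plain least squares
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted linear model to descriptor values X."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    return coef[0] + X @ coef[1:]


def weighted_rmse(y_true, y_pred, w):
    """RMSE in which each squared residual is weighted by w and normalised by sum(w)."""
    y_true, y_pred, w = (np.asarray(a, dtype=float) for a in (y_true, y_pred, w))
    return float(np.sqrt(np.sum(w * (y_true - y_pred) ** 2) / np.sum(w)))


if __name__ == "__main__":
    # Made-up toy data: one descriptor, four compounds, last one has zero confidence.
    x = [0.0, 1.0, 2.0, 3.0]
    y = [1.0, 3.0, 5.0, 20.0]
    cs = [1.0, 1.0, 1.0, 0.0]
    coef = weighted_linear_fit(x, y, cs)
    print("coefficients:", coef)
    print("weighted RMSE:", weighted_rmse(y, predict(coef, x), cs))
```

With uniform weights both functions reduce to standard linear regression and standard RMSE, which makes the two variants easy to compare on the same data set.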

Similar resources

QSPR Analysis with Curvilinear Regression Modeling and Topological Indices

Topological indices are real numbers derived from the molecular graph G of a molecular structure. They are used for QSPR, QSAR, and structural design in chemistry, nanotechnology, and pharmacology. Moreover, physicochemical properties such as the boiling point, the enthalpy of vaporization, and stability can be estimated by QSAR/QSPR models. In this study, the QSPR (Quantitative St...

The Interplay between QSAR/QSPR Studies and Partial Order Ranking and Formal Concept Analyses

The often observed scarcity of physical-chemical as well as toxicological data hampers the assessment of potentially hazardous chemicals released to the environment. In such cases Quantitative Structure-Activity Relationships/Quantitative Structure-Property Relationships (QSAR/QSPR) constitute an obvious alternative for rapidly, effectively and inexpensively generating missing experimental valu...

Modeling and Performance of Waste Tires as Media in Fixed Bed Sequence Batch Reactor

Introduction: Modeling aims to simulate or optimize a process in physical, chemical, or biological environments, and a sufficiently suitable model provides considerable assistance in generating data and predicting unknown conditions. Unsuitable disposal and elimination of waste tires have polluted the environment and human living areas; they have also caused removal of a hu...

OPERA models for predicting physicochemical properties and environmental fate endpoints

The collection of chemical structure information and associated experimental data for quantitative structure-activity/property relationship (QSAR/QSPR) modeling is facilitated by an increasing number of public databases containing large amounts of useful data. However, the performance of QSAR models highly depends on the quality of the data and modeling methodology used. This study aims to deve...

Prediction and modeling of fluoride concentrations in groundwater resources using an artificial neural network: a case study in Khaf

Background: One issue of concern in water supply is the quality of water. Measuring the qualitative parameters of water is time-consuming and costly. Predicting these parameters using various models leads to a reduction in related expenses and the presentation of overall and comprehensive statistics for water resource management. Methods: The present study used an artificial neural...

Journal:
  • Journal of Chemical Information and Modeling

Volume 55, Issue 8

Pages: -

Publication date: 2015